Hierarchical Clustering Of Verbs
Abstract
In this paper we present an unsupervised learning algorithm for incremental concept formation, based on an augmented version of COBWEB. The algorithm is applied to the task of acquiring a verb taxonomy through the systematic observation of verb usages in corpora. Applying a machine learning methodology to a natural language problem required adjustments on both sides: concept formation algorithms assume that the input information is stable, unambiguous, and complete, whereas linguistic data are ambiguous, incomplete, and possibly erroneous. A natural language (NL) processor is used to extract the thematic roles of verbs semi-automatically from corpora and to derive a feature-vector representation of verb instances. To account for multiple instances of the same verb, the measure of category utility, defined in COBWEB, has been augmented with the notion of memory inertia. Memory inertia models the influence that previously classified instances of a given verb have on the classification of subsequent instances of the same verb. Finally, a method is defined to identify the basic-level classes of an acquired hierarchy, i.e. those that carry the most predictive information about their members.
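As a rough illustration of the measure the abstract builds on, the sketch below computes the standard category utility used by COBWEB for a flat partition of nominal feature vectors. It does not include the memory inertia term, which the abstract names but does not define. The feature names (`agent`, `object`) and their values are hypothetical stand-ins for thematic-role features, not data from the paper.

```python
from collections import Counter


def expected_correct_guesses(instances):
    """Sum over attributes and values of P(attribute = value) ** 2."""
    n = len(instances)
    if n == 0:
        return 0.0
    attributes = {attr for inst in instances for attr in inst}
    total = 0.0
    for attr in attributes:
        counts = Counter(inst[attr] for inst in instances if attr in inst)
        total += sum((c / n) ** 2 for c in counts.values())
    return total


def category_utility(partition):
    """Standard COBWEB category utility of a partition (list of clusters).

    Each cluster is a list of instances; each instance is a dict mapping
    an attribute name to a nominal value.
    """
    all_instances = [inst for cluster in partition for inst in cluster]
    n = len(all_instances)
    if n == 0 or not partition:
        return 0.0
    baseline = expected_correct_guesses(all_instances)
    gain = sum(
        (len(cluster) / n) * (expected_correct_guesses(cluster) - baseline)
        for cluster in partition
    )
    return gain / len(partition)


# Hypothetical verb instances described by thematic-role fillers.
buy_like = {"agent": "human", "object": "goods"}
sign_like = {"agent": "organization", "object": "document"}

separated = [[buy_like, buy_like], [sign_like, sign_like]]
mixed = [[buy_like, sign_like], [buy_like, sign_like]]

print(category_utility(separated))
print(category_utility(mixed))
```

A partition that groups similar instances together scores higher than one that mixes them, which is the property an incremental learner of this kind exploits when deciding where to place a new verb instance.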
Similar resources
‘Over reference’: a comparative study on German prefix-verbs
• Experiment: Hierarchical clustering of 4 × 10 prefix-verbs on über (over). We extracted vector representations for all items in our dataset (derived and simple verbs) by relying on a state-of-the-art technique (cf. Mikolov et al. [2013] continuous bag-of-words representation). The distributional semantic model on which our experiment was conducted was extracted from the SdeWac corpus (cf. Faaß...
Evaluating Hierarchies of Verb Argument Structure with Hierarchical Clustering
Verbs can only be used with a few specific arrangements of their arguments (syntactic frames). Most theorists note that verbs can be organized into a hierarchy of verb classes based on the frames they admit. Here we show that such a hierarchy is objectively well-supported by the patterns of verbs and frames in English, since a systematic hierarchical clustering algorithm converges on the same s...
Towards a Semantic Classification of Spanish Verbs Based on Subcategorisation Information
We present experiments aiming at an automatic classification of Spanish verbs into lexical semantic classes. We apply well-known techniques that have been developed for the English language to Spanish, proving that empirical methods can be re-used across languages without substantial changes in the methodology. Our results on subcategorisation acquisition compare favourably to the state of the...
Identifying Metaphor Hierarchies in a Corpus Analysis of Finance Articles
Using a corpus of over 17,000 financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks, and shares. Using measures of the overlap in the argument distributions of these verbs and k-means clustering of their distributions, we advance evidence for the proposal that the metaphors re...
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with a huge volume of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering data is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but we do not have any choice to select the number of clusters and the number of members ...
Feature Extraction of Concepts by Independent Component Analysis
Semantic clustering is important to various fields in the modern information society. In this work we applied the Independent Component Analysis method to the extraction of the features of latent concepts. We used verb and object noun information and formulated a concept as a linear combination of verbs. The proposed method is shown to be suitable for our framework and it performs better than a...
Publication date: 1993